A Source Dependency Model for Statistical Machine Translation

نویسندگان

  • Deyi Xiong
  • Min Zhang
  • Haizhou Li
چکیده

In the formally syntax-based MT, a hierarchical tree generated by synchronous CFG rules associates the source sentence with the target sentence. In this paper, we propose a source dependency model to estimate the probability of the hierarchical tree generated in decoding. We develop this source dependency model from word-aligned corpus, without using any linguistically motivated parsing. Our experimental results show that integrating the source dependency model into the formally syntax-based machine translation significantly improves the performance on Chinese-to-English translation tasks.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English language are hyphenated and hyphen is used to separate different parts. Persian language consists of multi-part words as well. Based on Persian morphology, half-space character is needed to separate parts of multi-part words where in many cases people incorrectly use space character instead of half-space character. This common incorrectly use of space leads to some s...

متن کامل

A Dependency Treelet String Correspondence Model for Statistical Machine Translation

This paper describes a novel model using dependency structures on the source side for syntax-based statistical machine translation: Dependency Treelet String Correspondence Model (DTSC). The DTSC model maps source dependency structures to target strings. In this model translation pairs of source treelets and target strings with their word alignments are learned automatically from the parsed and...

متن کامل

Part-of-Speech Induction in Dependency Trees for Statistical Machine Translation

This paper proposes a nonparametric Bayesian method for inducing Part-ofSpeech (POS) tags in dependency trees to improve the performance of statistical machine translation (SMT). In particular, we extend the monolingual infinite tree model (Finkel et al., 2007) to a bilingual scenario: each hidden state (POS tag) of a source-side dependency tree emits a source word together with its aligned tar...

متن کامل

Dependency-Based Bracketing Transduction Grammar for Statistical Machine Translation

In this paper, we propose a novel dependency-based bracketing transduction grammar for statistical machine translation, which converts a source sentence into a target dependency tree. Different from conventional bracketing transduction grammar models, we encode target dependency information into our lexical rules directly, and then we employ two different maximum entropy models to determine the...

متن کامل

Dependency Treelet Translation: Syntactically Informed Phrasal SMT

We describe a novel approach to statistical machine translation that combines syntactic information in the source language with recent advances in phrasal translation. This method requires a source-language dependency parser, target language word segmentation and an unsupervised word alignment component. We align a parallel corpus, project the source dependency parse onto the target sentence, e...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009